Running Title: Novel Insights into Turkey Hemorrhagic Enteritis Virus Transcriptome

Abraham Quaye1*, Bret Pickett*, Joel S. Griffitts*, Bradford K. Berges*, Brian D. Poole\({^\dagger}\)*

*Department of Microbiology and Molecular Biology, Brigham Young University
1First-author
\({^\dagger}\) Corresponding Author

Corresponding Author Information
brian_poole@byu.edu
Department of Microbiology and Molecular Biology,
4007 Life Sciences Building (LSB),
Brigham Young University,
Provo, Utah

ABSTRACT

Background: Hemorrhagic enteritis (HE) is a disease affecting 6-12-week-old turkeys characterized by immunosuppression (IS) and bloody diarrhea. This disease is caused by Turkey Hemorrhagic Enteritis Virus (THEV) of which avirulent strains (THEV-A) that do not cause HE but retain the immunosuppressive ability have been isolated. The THEV-A Virginia Avirulent Strain (VAS) is still used as a live vaccine despite its immunosuppressive properties. Our objective is to understand the genetic basis by which VAS induces IS. The transcriptome of THEV was studied to set the stage for further experimentation with specific viral genes that may mediate IS.
Methods: After infecting a turkey B-cell line (MDTC-RP19) with the VAS vaccine strain, samples in triplicates were collected at 4-, 12-, 24-, and 72-hours post-infection. Total RNA was subsequently extracted, and poly-A-tailed mRNA sequencing done. After trimming the raw sequencing reads with the FastQC, reads were mapped to the THEV genome using Hisat2 and transcripts assembled with StringTie. An in-house script was used to consolidate transcripts from all time-points, generating the final transcriptome. PCR, gel electrophoresis, and Sanger sequencing were used to validate all identified splice junctions.
Results and Conclusions: A total of 18.1 million reads mapped to THEV genome providing good coverage/depth, leaving no regions unmapped. All predicted genes in the genome were represented. In keeping with all adenoviruses, all transcripts were spliced with either with 5’- or 3’-multi exon UTRs hitherto unknown. Thirteen novel exons were identified which were validated by PCR and Sanger sequencing. The splicing patterns strongly suggest that there are three main promoters (E1, E3, and major late promoters) driving expression of most of the genes with two possible minor promoters driving single genes (ORF7 and ORF8). This RNA-sequencing experiment is the first study of THEV gene expression to date. In keeping with other Adenoviruses, almost all THEV genes are spliced, and several genes are expressed as one transcription unit under a single promoter. This insight into THEV’s transcriptome may allow the engineering of the VAS to provide immune protection with less or no associated IS.

INTRODUCTION

Adenoviruses (AdVs) are non-enveloped icosahedral-shaped DNA viruses, causing infection in virtually all vertebrates. Their double-stranded linear DNA genomes range between 26 and 45kb in size, producing a broad repertoire of transcripts via a highly complex alternative splicing pattern (1, 2). The AdV genome is one of the most optimally economized; both the forward and reverse DNA strands harbor protein-coding genes, making it highly gene-dense. There are 16 genes termed “genus-common” that are homologous in all AdVs; these are thought to be inherited from a common ancestor. All other genes are termed “genus-specific”. “Genus-specific” genes tend to be located at the termini of the genome while “genus-common” genes are usually central (1). This pattern is observed in Adenoviridae, Poxviridae, and Herpesviridae (1, 3, 4). The family Adenoviridae consists of five genera: Mastadenovirus (MAdV), Aviadenovirus, Atadenovirus, Ichtadenovirus, and Siadenovirus (SiAdV) (5, 6). Currently, there are three recognized members of the genus SiAdV: frog adenovirus 1, raptor adenovirus 1, and turkey adenovirus 3 also called turkey hemorrhagic enteritis virus (THEV) (5, 7–10). Members of SiAdV have the smallest genome size (~26 kb) and gene content (~23 genes) of all known AdVs, and many “genus-specific” putative genes of unknown functions have been annotated (see Figure 1)(1, 2, 7).

Virulent strains (THEV-V) and avirulent strains (THEV-A) of THEV are serologically indistinguishable, infecting turkeys, chickens, and pheasants and the THEV-V cause different clinical diseases in these birds (2, 11). In turkeys, the THEV-V cause hemorrhagic enteritis (HE), a debilitating acute disease affecting predominantly 6-12-week-old turkeys characterized by immunosuppression (IS), weight loss, intestinal lesions leading to bloody diarrhea, splenomegaly, and up to 80% mortality (11–13). HE is the most economically significant disease caused by any strain of THEV (11). While the current vaccine strain (a THEV-A isolated from a pheasant, Virginia Avirulent Strain [VAS]) have proven effective at preventing HE in young turkey poults, it still retains the immunosuppressive ability. Thus, vaccinated birds are rendered more susceptible to opportunistic infections and death than unvaccinated cohorts leading to substantial economic losses (11, 14–16). The induced IS also interferes with vaccination schemes for other infections of turkeys (11, 14). To eliminate this immunossupressive side-effect of the vaccine, a thorough investigation of the culprit viral factors (genes) mediating this phenomenon is essential. However, the transcriptome (splicing and gene expression patterns) of THEV has not been characterized, making the investigation of specific viral genes for possible roles in causing IS impractical. A well-characterized transcriptome of THEV is required to enable the next leap forward in THEV research - experimentation with specific viral genes that may mediate IS.

Myriads of studies have elucidated the AdV transcriptome in fine detail (17, 18). However, a large preponderance of studies focus on MAdVs - specifically human AdVs - thus, most of the current knowledge regarding AdV gene expression and replication is based on MAdV studies, which is generalized for all other AdVs (6, 19). MAdV genes are transcribed in a temporal manner; therefore, genes are categorized into five early transcription units (E1A, E1B, E2, E3, and E4), two intermediate (IM) units (pIX and IVa2), and one major late unit (MLTU), which generates five families of late mRNAs (L1-L5). An additional gene (UXP or U exon) is located on the reverse strand. The early genes encode non-structural proteins such as enzymes or host cell modulating proteins, primarily involved in DNA replication or providing the necessary intracellular niche for optimal replication while late genes encode structural proteins. The immediate early gene E1A is expressed first, followed by the the delayed early genes, E1B, E2, E3 and E4. Then the intermediate early genes, IVa2 and pIX are expressed followed by the late genes (6, 17, 18). MAdV makes an extensive use of alternative RNA splicing to produce a very complex array of mRNAs; all but pIX mRNA undergo at least one splicing event. The MLTU produces over 20 distinct splice variants all of which contain three non-coding exons at the 5’-end (collectively known as the tripartite leader, TPL) (17, 18). There is also an alternate 5’ three non-coding exons present in varying amounts on a subset of MLTU mRNAs (known as the x-, y- and z-leaders). Lastly, there is the i-leader exon, which is infrequently included between the second and third TPL exons, and codes for the i-leader protein (20). Thus, the MLTU produces a complex repertoire of mRNA with diverse 5’-UTRs spliced onto different 3’ coding exons which are grouped into five different 3’-end classes (L1-L5). Each transcription unit (TU) contains its own promoter driving the expression of all the array of mRNA transcripts produced via alternative splicing of the genes encoded in the unit(6, 17, 18). Almost all AdV mRNAs are generated by the excision of one or more introns and most of these introns are located in the 5’ or 3’ UTRs of pre-mRNA. Thus the viral introns scarcely interrupt the open reading frames (ORFs) (1, 18).

High throughput sequencing methods have facilitated the discovery of many novel transcribed regions and splicing isoforms. It is also a very powerful tool to study alternative splicing under different conditions at an unparalleled depth (18, 21). In this paper, a paired-end deep sequencing experiment was performed to characterize for the first time, the transcriptome of THEV (VAS vaccine strain) during different phases of the infection, yielding the first THEV splicing map. Our paired-end sequencing allowed for reading 149 bp long high quality (mean Phred Score of 36) sequences from each end of cDNA fragments, which were mapped to the genome of THEV. The generated data from our paired-end sequencing experiment should thus be reliable.

RESULTS

Overview of sequencing data and analysis pipeline outputs
A previous study by Zeinab et al showed that almost all THEV transcripts were detectable beginning at 4 hours (22). Therefore, infected MDTC-RP19 cells were harvested at 4-, 12-, 24-, and 72-hours post-infection(h.p.i) to ensure an amply wide time window to sample all transcripts. Our paired-end RNA sequencing (RNA-seq) experiment yielded an average of 107.1 million total reads of 149bp in length per time-point, which were simultaneously mapped to both the virus (THEV) and host (M.gallopavo) genomes using the Hisat2 (23) alignment program. A total of 18.1 million reads from all time-points mapped to the virus genome; this provided good coverage/depth, leaving no regions unmapped. The mapped reads to the virus genome increased substantially from 432 reads at 4 h.p.i to 16.9 million reads at 72 h.p.i (Table 1, Figure 2a). From the mapped reads, we identified an overall total of 2,457 unique THEV splice junctions from all time-points, with splice junctions from the later time-points being supported by significantly more sequence reads than earlier time-points. For example all the 13 unique junctions at 4 h.p.i had less than 10 reads supporting each one, averaging a mere 2.8 reads/junction. Conversely, the 2,374 unique junctions at 72 h.p.i averaged 898.4 reads/junction, some junctions having coverage as high as 322,677 reads. The substantial increases in splice junctions and mapping reads to the THEV genome over time denotes an active infection, and correlates with our qPCR assay quantifying the total number of viral genome copies over time (Figure 2b). Using StringTie (23), an assembler of RNA-seq alignments into potential transcripts, the mapped reads for each time-point were assembled into transcripts using the genomic location of the predicted THEV ORFs as a guide. In the final consolidated transcriptome of THEV, a composite of all unredundant transcripts from all time points, we counted a total of 28 transcripts all of which are novel. Although some exons in some transcripts match the predicted ORFs exactly, most of our identified exons are longer, spanning multiple predicted ORFs (Figure 3). The complete list of unique splice junctions mapped to THEV’s genome has been submitted to the National Center for Biotechnology Information Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under accession no. XXXXXX.

Changes in THEV splice junction profile over time
AdV gene expression occurs under exquisite temporal control, supervised by designated promoters for each transcription unit (TU). Each promoter typically produce one or few pre-mRNAs that undergo alternative splicing to yield the manifold repertoire of complex transcripts characteristic of AdVs. To evaluate the activity of each promoter over time, Firstly, Ballgown - a program for statistical analysis of assembled transcriptomes, was used to estimate and normalize expression levels of all transcripts in Fragments Per Kilobase of transcript per Million mapped reads (FPKM) units over time. This revealed DBP - from the E2 region - to be the most significantly expressed at 12 h.p.i, whereas MLP region transcripts were most abundantly expressed at 24 and 72 h.p.i as expected (Figure 4a). Also, our analysis of the FPKM values of transcripts by region showed a similar pattern: the E2 region as the most significantly overexpressed at 12 h.p.i, after which the MLP region assumes predominance (Figure 4b). Secondly, we estimated relative abundances of all splice junctions for each time point, counting as significantly expressed only those with coverage of at least 1% of the total splice junction reads counted at the given time point. Very few unique splice junctions and reads were counted at 4 h.p.i; hence, this time point was excluded in this analysis. At 12 h.p.i, 20 junctions meet the 1% threshold, and were predominantly early region (E1, E2, and E3) junctions, albeit a significant proportion are MLP-derived (Table 2a). Junctions from the E2 region were the most abundant at this time point, constituting 42.6% of all junctions. The two most abundant junctions at 12 h.p.i were maintained at 24 h.p.i also as the most significantly expressed. However, the MLP-derived junctions were unsurprisingly the most preponderant overall at this time point, accounting for NA% of all the junction reads counted (Table 2b). At 72 h.p.i, the trend of increased activity of the MLP continued as expected; at this time, the MLP-derived junctions was not only the most abundant overall - accounting for 67.5% of all junctions reads - but also contained the most significantly expressed individual junctions (Table 2c. Also see Supplementary Tables 1a-c; Figure 5a). When we limited this analysis to only junctions in the final transcriptome, the relative abundances of the junctions for each region over time was substantially similar to the pattern seen with all the junctions included (Figure 5b). Furthermore, we analyzed splice donor and acceptor site nucleotide usage over time to investigate any peculiarities that THEV may show, generally or over the course of the infection. This revealed that, with the exception of one junction (which was a T-G donor-acceptor pair), all splice donor-acceptor pairs in the final transcriptome were the canonical T-A pair.

Early Region 1 (E1) transcripts. This region in MAdVs is the first transcribed after successful entry of the viral DNA into the nucleus of the host cell. The host transcription machinery is solely responsible for the transcription of this region. After their translation, the E1 proteins in concert with a myriad of host transcription factors activate the other viral promoters (6). We discovered four novel transcripts in this region with varying protein-coding potential (PCP).

and also validated the predicted ORF (sialidase; ORF1).

Early Region 2A (E2A) transcripts.

Early Region 2B (E2B) transcripts.

Early Region 3 (E3) transcripts.

Early Region 4 (E4) transcripts.

Intermediate Region (IM) transcripts.

Major Late Promoter Region (MLP) transcripts.

MATERIALS AND METHODS

Cell culture and THEV Infection

The Turkey B-cell line (MDTC-RP19, ATCC CRL-8135) was grown as suspension cultures in 1:1 complete Leibovitz’s L-15/McCoy’s 5A medium with 10% fetal bovine serum (FBS), 20% chicken serum (ChS), 5% tryptose phosphate broth (TPB), and 1% antibiotics solution (100 U/mL Penicillin and 100ug/mL Streptomycin), at 41oC in a humidified atmosphere with 5% CO2. Infected cells were maintained in 1:1 serum-reduced Leibovitz’s L15/McCoy’s 5A media (SRLM) with 2.5% FBS, 5% ChS, 1.2% TPB, and 1% antibiotics solution (100 U/mL Penicillin and 100ug/mL Streptomycin). A commercially available HE vaccine was purchased from Hygieia Biological Labs as a source of THEV-A (VAS strain). The stock virus was titrated using an in-house qPCR assay with titer expressed as genome copy number(GCN)/mL, similar to Mahshoub et al(24) with modifications. Cells were infected at a multiplicity of infection (MOI) of 100 GCN/cell and samples in triplicates were harvested at 4-, 12-, 24-, and 72-h.p.i for RNA-seq. The infection was repeated but samples in triplicates were harvested at 12-, 24-, 36-, 48-, and 72-h.p.i for PCR validation of novel splice sites.

RNA extraction and Sequencing

Total RNA was extracted from infected cells using Thermofishers’ RNAqueous™-4PCR Total RNA Isolation Kit (#AM1914) as per manufacturer’s instructions. An agarose gel electrophoresis was performed to check RNA integrity. The RNA quantity and purity was initially assessed using nanodrop, and RNA was used only if the A260/A280 ratio was 2.0 ± 0.05 and the A260/A230 ratio was >2 and <2.2. Extracted total RNA samples were sent to LC Sciences, Houston TX for poly-A-tailed mRNA sequencing where RNA integrity was checked with Agilent Technologies 2100 Bioanalyzer High Sensitivity DNA Chip and poly(A) RNA-seq library was prepared following Illumina's TruSeq-stranded-mRNA sample preparation protocol. Paired-end sequencing was performed on Illumina's NovaSeq 6000 sequencing system.

Computational Analysis of RNA Sequencing Data: Mapping and Transcript characterization

Analysis of our sequence reads were analyzed following a well established protocol described by Pertea et al (23), using SNAKEMAKE 7.24.0 to drive the pipeline. Briefly, sequencing reads were trimmed with the FastQC - version 0.11.9 (25) program to achieve an overall Mean Sequence Quality (Phred Score) of 36. Trimmed reads were mapped to the complete sequence of avirulent turkey hemorrhagic enteritis virus strain Virginia (https://www.ncbi.nlm.nih.gov/nuccore/AY849321.1/) and Meleagris gallopavo (https://www.ncbi.nlm.nih.gov/genome/?term=Meleagris+gallopavo) using Hisat2 - version 2.2.1 (23) with default settings without relying on known splice sites. The generated BAM files from each infection time-point were filtered for reads mapping to the THEV genome and fed into StringTie - version 2.2.1 (23) using a gff3 file from NCBI containing the predicted ORFs of THEV as a guide. A custom script was used to consolidate all transcripts from all time-points without redundancy, generating the final transcriptome of THEV.

Validation of Novel Splice Junctions

All splice junctions identified in this work are novel except one predicted splice site each for pTP and DBP, which were corroborated in our work. However, these predicted splice junctions have not been experimentally validated hitherto, and we identified additional novel splice junctions beside the predicted junctions, giving a more complete picture of the transcripts.

The novel splice junctions after consolidating all transcripts with StringTie which we validated by PCR and Sanger Sequencing are shown in Table###1. We designed primers that crossed a range of novel exon–exon boundaries for each specific transcript in a transcription unit with their respective universal primers (supplementary, PCR methods). Each forward primer contained a KpnI restriction site and reverse primers, an XbaI site. After first-strand cDNA synthesis with SuperScript™ III First-Strand Synthesis System (ThermoFisher SCIENTIFIC), these primers were used in a targeted PCR experiment, the PCR products were analysed on Agarose gels, cloned by traditional restriction enzyme method and Sanger sequenced to validate these splice junctions at the sequence level.

3’ Rapid Amplification of cDNA Ends (3’RACE)

DISCUSSION/CONCLUSIONS

SCRIPTS AND SUPPLEMENTARY MATERIALS

DATA AVAILABILITY

CODE AVAILABILITY

All the code/scripts written for analysis of the data is available on github (linkXXXXXX)

ACKNOWLEDGMENTS

LC Sciences Eton Bioscience, Inc, San Diego, CA

REFERENCES

1.
Davison A, Benko M, Harrach B. 2003. Genetic content and evolution of adenoviruses. The Journal of general virology 84:2895–908.
2.
Harrach B. 2008. Adenoviruses: General features, p. 1–9. In Mahy, BWJ, Van Regenmortel, MHV (eds.), Encyclopedia of virology (third edition). Book Section. Academic Press, Oxford.
3.
Upton C, Slack S, Hunter AL, Ehlers A, Roper RL. 2003. Poxvirus orthologous clusters: Toward defining the minimum essential poxvirus genome. Journal of virology 77:7590–7600.
4.
McGeoch D, Davison AJ. 1999. Chapter 17 - the molecular evolutionary history of the herpesviruses, p. 441–465. In Domingo, E, Webster, R, Holland, J (eds.), Origin and evolution of viruses. Book Section. Academic Press, London.
5.
Harrach B, Benko M, Both GW, Brown M, Davison AJ, Echavarría M, Hess M, Jones M, Kajon A, Lehmkuhl HD, Mautner V, Mittal S, Wadell G. 2011. Family adenoviridae. Virus Taxonomy: 9th Report of the International Committee on Taxonomy of Viruses 125–141.
6.
Guimet D, Hearing P. 2016. 3 - adenovirus replication, p. 59–84. In Curiel, DT (ed.), Adenoviral vectors for gene therapy (second edition). Book Section. Academic Press, San Diego.
7.
Kovács ER, Benkő M. 2011. Complete sequence of raptor adenovirus 1 confirms the characteristic genome organization of siadenoviruses. Infection, Genetics and Evolution 11:1058–1065.
8.
Davison AJ, Wright KM, Harrach B. 2000. DNA sequence of frog adenovirus. J Gen Virol 81:2431–2439.
9.
10.
Katoh H, Ohya K, Kubo M, Murata K, Yanai T, Fukushi H. 2009. A novel budgerigar-adenovirus belonging to group II avian adenovirus of siadenovirus. Virus Research 144:294–297.
11.
12.
Beach NM, Duncan RB, Larsen CT, Meng XJ, Sriranganathan N, Pierson FW. 2009. Comparison of 12 turkey hemorrhagic enteritis virus isolates allows prediction of genetic factors affecting virulence. J Gen Virol 90:1978–85.
13.
Gross WB, Moore WE. 1967. Hemorrhagic enteritis of turkeys. Avian Dis 11:296–307.
14.
Rautenschlein S, Sharma JM. 2000. Immunopathogenesis of haemorrhagic enteritis virus (HEV) in turkeys. Dev Comp Immunol 24:237–46.
15.
Larsen CT, Domermuth CH, Sponenberg DP, Gross WB. 1985. Colibacillosis of turkeys exacerbated by hemorrhagic enteritis virus. Laboratory studies. Avian Dis 29:729–32.
16.
Dhama K, Gowthaman V, Karthik K, Tiwari R, Sachan S, Kumar MA, Palanivelu M, Malik YS, Singh RK, Munir M. 2017. Haemorrhagic enteritis of turkeys – current knowledge. Veterinary Quarterly 37:31–42.
17.
Donovan-Banfield I, Turnell AS, Hiscox JA, Leppard KN, Matthews DA. 2020. Deep splicing plasticity of the human adenovirus type 5 transcriptome drives virus evolution. Communications Biology 3:124.
18.
Zhao H, Chen M, Pettersson U. 2014. A new look at adenovirus splicing. Virology 456-457:329–341.
19.
Wolfrum N, Greber UF. 2013. Adenovirus signalling in entry. Cell Microbiol 15:53–62.
20.
Falvey E, Ziff E. 1983. Sequence arrangement and protein coding capacity of the adenovirus type 2 "i" leader. Journal of Virology 45:185–191.
21.
Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, Tanzer A, Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Zaleski C, Rozowsky J, Röder M, Kokocinski F, Abdelhamid RF, Alioto T, Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen X, Chrast J, Curado J, Derrien T, Drenkow J, Dumais E, Dumais J, Duttagupta R, Falconnet E, Fastuca M, Fejes-Toth K, Ferreira P, Foissac S, Fullwood MJ, Gao H, Gonzalez D, Gordon A, Gunawardena H, Howald C, Jha S, Johnson R, Kapranov P, King B, Kingswood C, Luo OJ, Park E, Persaud K, Preall JB, Ribeca P, Risk B, Robyr D, Sammeth M, Schaffer L, See L-H, Shahab A, Skancke J, Suzuki AM, Takahashi H, Tilgner H, Trout D, Walters N, Wang H, Wrobel J, Yu Y, Ruan X, Hayashizaki Y, Harrow J, Gerstein M, Hubbard T, Reymond A, Antonarakis SE, Hannon G, Giddings MC, Ruan Y, Wold B, Carninci P, Guigó R, Gingeras TR. 2012. Landscape of transcription in human cells. Nature 489:101–108.
22.
Aboezz Z, Mahsoub H, El-Bagoury G, Pierson F. 2019. In vitro growth kinetics and gene expression analysis of the turkey adenovirus 3, a siadenovirus. Virus Research 263:47–54.
23.
Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. 2016. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and ballgown. Nature Protocols 11:1650–1667.
24.
Mahsoub HM, Evans NP, Beach NM, Yuan L, Zimmerman K, Pierson FW. 2017. Real-time PCR-based infectivity assay for the titration of turkey hemorrhagic enteritis virus, an adenovirus, in live vaccines. Journal of Virological Methods 239:42–49.
25.
2015. FastQC.

TABLES AND FIGURES

Figure 1. Genomic map of THEV avirulent strain. The central horizontal line represents the double-stranded DNA marked at 5kb intervals as white line breaks. Blocks represent viral genes. Blocks above the DNA line are transcribed rightward, those below are transcribed leftward. pTP, DBP and 33K predicted to be spliced are shown as having tails. Shaded regions indicate regions containing “genus-specific” genes (colored red). Genes colored in blue are “genus-common”. Gene colored in light green is conserved in all but Atadenoviruses. The UXP (light blue) is an incomplete gene present in almost all AdVs. Regions comprising the different transcription units are labelled at the bottom (E1, E2A, E2B, E3, E4, and IM); the unlabeled regions comprise the MLTU.

Metric 4h.p.i 12h.p.i 24h.p.i 72h.p.i Total

Total reads

1.17e+08

7.63e+07

1.20e+08

1.15e+08

4.28e+08

Mapped
(Host)

1.04e+08
(89.06%)

6.79e+07
(89.0393%)

1.06e+08
(88.2719%)

8.38e+07
(72.9802%)

3.62e+08

Mapped
(THEV)

4.32e+02
( 0.0004%)

6.70e+03
( 0.0088%)

1.18e+06
( 0.9841%)

1.69e+07
(14.6904%)

1.81e+07

Mean Per Base
Coverage/Depth

2.42

37.71

6,666.96

95,041.7

101,749

Total unique
splice junctions

13

37

236

2,374

2,457

Junction coverage
Total (≥ 1 read)

37

605

115,075

2.13e+06

2.25e+06

Junction coverage
Mean reads

2.8

16.4

487.6

898.4

351.3

Junction coverage
≥ 10 reads

0

13

132

1,791

1,936

Junction coverage
≥ 100 reads

0

1

53

805

859

Junction coverage
≥ 1000 reads

0

0

18

168

186

Figure 2. a) Per base coverage of sequence reads mapping to THEV genome by time point. The pileup of mRNA reads mapping to THEV genome at the base-pair level for each indicated time point show remarkable difference in terms of quantities. There is a dramatic increase of mean coverage/depth from 2.42 at 4 h.p.i to 95,042 at 72 h.p.i, strongly demonstrating an active infection. Unexpectedly, the pileup of reads seems consistently skewed over similar regions of the genome. We could speculate that the temporal gene expression regulation of THEV is different from MAdVs or this could simply mean that the infection was not well synchronized. However, the relative proportions over these similar regions shows some variation over time. b) One-step growth of THEV (VAS vaccine strain) in MDTC-RP19 cell line. After infecting cells at an MOI of 100 GCN/cell, triplicates of harvested infected cells were quantified with an in-house qPCR assay measuring the total copies of THEV genome. There is no discernible increase in virus titer up 12 h.p.i, after which there is a steady increase in virus titer is measured. The virus titer expands exponentially beginning from 48 h.p.i, increasing by orders of magnitude before reaching a plateau at 120 h.p.i, probably due to high cell death. GCN: genome copy number.

Figure 3. a) Full transcriptome of THEV. THEV transcripts assembled from all time points by StringTie are unified forming this final transcriptome (splicing map). Transcripts belonging to the same transcription unit (TU) are located in close proximity on the genome and are color coded and labeled in this figure as such. The organization of TUs in the THEV genome is unsurprisingly similar to MAdVs; however, the MAdV genome shows significantly more transcripts. The TUs are color coded: E1 transcripts - red, E2 - black, E3 - dark grey, E4 - green, MLP - blue. Predicted ORFs are also indicated here, colored light grey. b) THEV transcripts identified at given time points. Transcripts are color coded as explained in a.

Table 2A

Timepoint Strand Start End Splice_Site Splice Acceptor-Donor Region Reads Intron Length Reads_Percentage
12hpi - 18189 18684 GT-AG T-A E2 111 495 bp 111 (18.3%)
12hpi - 18087 18159 GT-AG T-A E2 103 72 bp 103 (17%)
12hpi + 7531 7754 GT-AG T-A MLP 58 223 bp 58 (9.6%)
12hpi - 25701 26055 GT-AG T-A E4 37 354 bp 37 (6.1%)
12hpi + 20223 20419 GT-AG T-A E3 33 196 bp 33 (5.5%)
12hpi + 4360 7454 GT-AG T-A MLP 32 3,094 bp 32 (5.3%)
12hpi - 18751 20668 GT-AG T-A E2 23 1,917 bp 23 (3.8%)
12hpi + 18350 18717 GT-AG T-A E3 21 367 bp 21 (3.5%)
12hpi + 18768 20162 GT-AG T-A E3 21 1,394 bp 21 (3.5%)
12hpi - 21735 22264 GT-AG T-A NA 20 529 bp 20 (3.3%)
12hpi + 7807 13610 GT-AG T-A MLP 18 5,803 bp 18 (3%)
12hpi + 18350 20162 GT-AG T-A E3 18 1,812 bp 18 (3%)
12hpi - 18751 21682 GT-AG T-A E2 12 2,931 bp 12 (2%)
12hpi + 24602 25524 CT-AC T-A NA 11 922 bp 11 (1.8%)
12hpi + 304 1616 GT-AG T-A E1 9 1,312 bp 9 (1.5%)
12hpi + 1655 1964 GT-AG T-A E1 9 309 bp 9 (1.5%)
12hpi - 22540 24500 GT-AG T-A NA 9 1,960 bp 9 (1.5%)
12hpi - 18087 18163 GT-AG T-A E2 8 76 bp 8 (1.3%)
12hpi + 7807 12238 GT-AG T-A MLP 7 4,431 bp 7 (1.2%)
12hpi + 7807 22492 GT-AG T-A MLP 6 14,685 bp 6 (1%)

Table 2B

Timepoint Strand Start End Splice_Site Splice Acceptor-Donor Region Reads Intron Length Reads_Percentage
24hpi - 18087 18159 GT-AG T-A E2 18,826 72 bp 18,826 (16.4%)
24hpi - 18189 18684 GT-AG T-A E2 18,794 495 bp 18,794 (16.3%)
24hpi + 7531 7754 GT-AG T-A MLP 12,319 223 bp 12,319 (10.7%)
24hpi + 20223 20419 GT-AG T-A E3 10,583 196 bp 10,583 (9.2%)
24hpi + 4360 7454 GT-AG T-A MLP 7,128 3,094 bp 7,128 (6.2%)
24hpi + 18350 20162 GT-AG T-A E3 6,619 1,812 bp 6,619 (5.8%)
24hpi + 18768 20162 GT-AG T-A E3 5,207 1,394 bp 5,207 (4.5%)
24hpi - 18751 20668 GT-AG T-A E2 4,123 1,917 bp 4,123 (3.6%)
24hpi + 18350 18717 GT-AG T-A E3 3,930 367 bp 3,930 (3.4%)
24hpi + 7807 13610 GT-AG T-A MLP 2,553 5,803 bp 2,553 (2.2%)
24hpi + 7807 12238 GT-AG T-A MLP 2,446 4,431 bp 2,446 (2.1%)
24hpi + 7807 22492 GT-AG T-A MLP 1,642 14,685 bp 1,642 (1.4%)
24hpi + 1655 1964 GT-AG T-A E1 1,395 309 bp 1,395 (1.2%)
24hpi + 7807 18717 GT-AG T-A MLP 1,391 10,910 bp 1,391 (1.2%)
24hpi - 18751 21128 GT-AG T-A E2 1,124 2,377 bp 1,124 (1%)
24hpi + 20223 20894 GT-AG T-A E3 1,208 671 bp 1,208 (1%)

Table 2C

Timepoint Strand Start End Splice_Site Splice Acceptor-Donor Region Reads Intron Length Reads_Percentage
72hpi + 7531 7754 GT-AG T-A MLP 322,677 223 bp 322,677 (15.1%)
72hpi + 4360 7454 GT-AG T-A MLP 179,607 3,094 bp 179,607 (8.4%)
72hpi - 18087 18159 GT-AG T-A E2 161,488 72 bp 161,488 (7.6%)
72hpi - 18189 18684 GT-AG T-A E2 156,037 495 bp 156,037 (7.3%)
72hpi + 20223 20419 GT-AG T-A E3 93,238 196 bp 93,238 (4.4%)
72hpi + 7807 13610 GT-AG T-A MLP 81,434 5,803 bp 81,434 (3.8%)
72hpi + 7807 12238 GT-AG T-A MLP 77,617 4,431 bp 77,617 (3.6%)
72hpi + 18768 20162 GT-AG T-A E3 45,062 1,394 bp 45,062 (2.1%)
72hpi + 1655 1964 GT-AG T-A E1 38,491 309 bp 38,491 (1.8%)
72hpi + 18350 20162 GT-AG T-A E3 38,841 1,812 bp 38,841 (1.8%)
72hpi + 18350 18717 GT-AG T-A E3 35,490 367 bp 35,490 (1.7%)
72hpi - 18751 20668 GT-AG T-A E2 28,454 1,917 bp 28,454 (1.3%)
72hpi + 304 1616 GT-AG T-A E1 25,041 1,312 bp 25,041 (1.2%)
72hpi + 7807 12904 GT-AG T-A MLP 21,948 5,097 bp 21,948 (1%)
72hpi + 7807 22492 GT-AG T-A MLP 21,891 14,685 bp 21,891 (1%)

Figure 4 Overview of change in expression of THEV transcripts over time. a) b)

Figure 5. a) Relative abundances of splice junctions over time as a proxy of estimating activity levels of THEV promoters. ## Description here ##########. b) Relative abundances of junctions in final transcriptome. ##Description here #######.